Ontology-Based MEDLINE Document Classification

Authors

Fabrice Camous

Stephen Blott

Alan F. Smeaton

Abstract

An increasing and overwhelming amount of biomedical information is available in the research literature, mainly in the form of free text. Biologists need tools that automate their information search and cope with the high volume and ambiguity of free text. Ontologies can help automatic information processing by providing standard concepts and information about the relationships between concepts. The Medical Subject Headings (MeSH) ontology is already available and is used by MEDLINE indexers to annotate the conceptual content of biomedical articles. This paper presents a domain-independent method that uses the inter-concept relationships of the MeSH ontology to extend the existing MeSH-based representation of MEDLINE documents. The extension method is evaluated within a document triage task organized by the Genomics track of the 2005 Text REtrieval Conference (TREC). Our method for extending the representation of documents leads to an improvement of 18.3% over a non-extended baseline in terms of normalized utility, the metric defined for the task.
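As a rough illustration of the two ideas in the abstract, the sketch below extends a document's set of MeSH concepts with ancestors from a concept hierarchy and computes a normalized utility score. It is not the authors' implementation: the toy hierarchy `TOY_PARENTS`, the function names, the `max_hops` parameter, and the choice of following only parent links are all assumptions made for illustration. The utility function follows the general form of a normalized utility measure (weighted true positives minus false positives, divided by the best achievable score), with the relevant-document weight `u_r` left as a parameter rather than fixed to any task-specific value.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy fragment of a concept hierarchy (child -> parents).
# Illustrative only; it is not data taken from the paper.
TOY_PARENTS: Dict[str, Set[str]] = {
    "Breast Neoplasms": {"Neoplasms by Site"},
    "Neoplasms by Site": {"Neoplasms"},
    "Neoplasms": set(),
}


def extend_with_ancestors(
    concepts: Iterable[str],
    parents: Dict[str, Set[str]],
    max_hops: int = 1,
) -> Set[str]:
    """Return the original concepts plus ancestors up to max_hops levels up.

    Assumption: the extension follows parent links only. The paper may use
    other inter-concept relationships or weight the added concepts.
    """
    extended = set(concepts)
    frontier = set(extended)
    for _ in range(max_hops):
        # Collect parents of the current frontier that are not yet included.
        frontier = {p for c in frontier for p in parents.get(c, set())} - extended
        if not frontier:
            break
        extended |= frontier
    return extended


def normalized_utility(tp: int, fp: int, all_positives: int, u_r: float) -> float:
    """Normalized utility: (u_r * TP - FP) / (u_r * all_positives).

    Assumption: a false positive costs 1 and a true positive earns u_r.
    """
    if all_positives <= 0 or u_r <= 0:
        raise ValueError("all_positives and u_r must be positive")
    return (u_r * tp - fp) / (u_r * all_positives)


if __name__ == "__main__":
    doc_concepts = {"Breast Neoplasms"}
    print(sorted(extend_with_ancestors(doc_concepts, TOY_PARENTS, max_hops=2)))
    print(normalized_utility(tp=10, fp=20, all_positives=10, u_r=20.0))
```

Under these assumptions, a classifier trained on the extended concept sets can match documents that share a broader concept even when their originally assigned headings differ, which is one plausible reading of why extension might help in a triage task.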

Related resources

Ontology-Based Feature Transformations: A Data-Driven Approach

We present a novel approach to incorporating semantic information into natural language processing problems, in particular the document classification task. The approach builds on the intuition that semantic relatedness of words can be viewed as a non-static property of the words that depends on the particular task at hand. The semantic relatedness information is incorporated using feat...

Mining and its Application in Biomedical Domain

Semantic Text Mining and its Application in Biomedical Domain. Illhoi Yoo; Xiaohua Hu, Ph.D. A huge amount of biomedical knowledge and novel discoveries have been produced and collected in text databases or digital libraries, such as MEDLINE, because the most natural form to store information is text. In order to cope with this pressing text information overload, text mining is employed. However, ...

Answering Gene Ontology terms to proteomics questions by supervised macro reading in Medline

Motivation and Objectives: Biomedical professionals have at their disposal a huge amount of literature. But when they have a precise question, they often have to deal with too many documents to find the appropriate answers in a reasonable time. Faced with this literature overload, the need for automatic assistance has been widely pointed out, and PubMed is argued to be only the beginn...

Evaluating the effect of unbalanced data in biomedical document classification

Document classification has become an active research field, partly because of the increasing availability of biomedical information in digital form, which needs to be catalogued and organized. In this context, machine learning techniques are usually applied to text classification by using a general inductive process that automatically builds a text classifier from a set of...

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...

Publication date: 2007
